%File: intro.tex
%Date: Fri Jan 03 22:40:59 2014 +0800



\section{Introduction}
\textbf{Speaker recognition} is the identification of a person from the characteristics
of their voice (voice biometrics); it is also called voice recognition~\cite{SRwiki}.

\textbf{Speaker recognition} tasks can be classified according to different criteria:
text-dependent or text-independent, and verification (deciding whether the speaker is who they claim to be) or
identification (deciding who the speaker is from their voice)~\cite{SRwiki}.
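
The difference between the two decision types can be sketched as follows.
The notation is introduced here only for illustration and is not taken from \cite{SRwiki}:
assume $N$ enrolled speakers, an utterance $X$, a hypothetical score $s(X, k)$
measuring how well $X$ matches enrolled speaker $k$, and an assumed decision threshold $\theta$.
Identification picks the best-matching enrolled speaker,
\[
  \hat{k} = \arg\max_{1 \le k \le N} s(X, k),
\]
while verification of a claimed identity $c$ is a binary decision,
\[
  \mathrm{accept} \quad \mathrm{if} \quad s(X, c) \ge \theta,
  \qquad \mathrm{reject} \quad \mathrm{otherwise}.
\]
How $s$ and $\theta$ are obtained depends on the model and is not specified by this sketch.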

Speech is a complex signal produced by several transformations occurring at
different levels: semantic, linguistic and acoustic.
Differences in these transformations may lead to differences in the acoustic properties of the signal.
The recognizability of a speaker can be affected not only by the linguistic message
but also by the age, health, emotional state and effort level of the speaker.
Background noise and the quality of the recording device also interfere with
the classification process.

Speaker recognition is an important part of Human-Computer Interaction (HCI).
As the growing adoption of wearable computers shows,
the Voice User Interface (VUI) has become a vital part of such devices.
Because these devices are particularly small, they are more likely to be lost or stolen.
In these scenarios, speaker recognition not only provides a good HCI,
but also combines seamless interaction with the computer and protection
when the device is lost.
The need for personal identity validation will become more acute in the future.
Speaker verification may be essential in business telecommunications.
Telephone banking and telephone reservation services will develop rapidly
once secure means of authentication are available.

Also, the identity of a speaker is quite often at issue in court cases.
A crime victim may have heard but not seen the perpetrator,
but claim to recognize the perpetrator as someone whose voice was previously familiar;
or there may be recordings of a criminal whose identity is unknown.
Speaker recognition techniques may provide a reliable scientific determination in such cases.

Furthermore, these techniques can be used in environments that demand high security.
They can be combined with other biometrics to form a multi-modal authentication system.

In this project, we have built a proof-of-concept text-independent speaker recognition system with
GUI support. It is fast and accurate according to our tests on a large corpus,
and the GUI program requires only a very short utterance to respond quickly.
The whole system is fully described in this report.
This project is developed at Git9\footnote{Git hosting service of the department of CST, Tsinghua Univ., currently maintained by Yuxin Wu. See
  \url{http://git.net9.org}},
and is also hosted on GitHub\footnote{See \url{https://github.com/ppwwyyxx/speaker-recognition}}.
The repository contains the source code, all documents, the experiment log, and a video demo.
The complete package of this project also contains all the intermediate data, models, recordings, and third-party libraries.
